Goto

Collaborating Authors

 scalable bayesian optimization


Scalable Bayesian Optimization via Focalized Sparse Gaussian Processes

Neural Information Processing Systems

Bayesian optimization is an effective technique for black-box optimization, but its applicability is typically limited to low-dimensional and small-budget problems due to the cubic complexity of computing the Gaussian process (GP) surrogate. While various approximate GP models have been employed to scale Bayesian optimization to larger sample sizes, most suffer from overly-smooth estimation and focus primarily on problems that allow for large online samples. In this work, we argue that Bayesian optimization algorithms with sparse GPs can more efficiently allocate their representational power to relevant regions of the search space. To achieve this, we propose focalized GP, which leverages a novel variational loss function to achieve stronger local prediction, as well as FocalBO, which hierarchically optimizes the focalized GP acquisition function over progressively smaller search spaces. Experimental results demonstrate that FocalBO can efficiently leverage large amounts of offline and online data to achieve state-of-the-art performance on robot morphology design and to control a 585-dimensional musculoskeletal system.


Multiple Adaptive Bayesian Linear Regression for Scalable Bayesian Optimization with Warm Start

Perrone, Valerio, Jenatton, Rodolphe, Seeger, Matthias, Archambeau, Cedric

arXiv.org Machine Learning

Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization. Typically, BO is powered by a Gaussian process (GP), whose algorithmic complexity is cubic in the number of evaluations. Hence, GP-based BO cannot leverage large amounts of past or related function evaluations, for example, to warm start the BO procedure. We develop a multiple adaptive Bayesian linear regression model as a scalable alternative whose complexity is linear in the number of observations. The multiple Bayesian linear regression models are coupled through a shared feedforward neural network, which learns a joint representation and transfers knowledge across machine learning problems.


Scalable Bayesian Optimization Using Deep Neural Networks

Snoek, Jasper, Rippel, Oren, Swersky, Kevin, Kiros, Ryan, Satish, Nadathur, Sundaram, Narayanan, Patwary, Md. Mostofa Ali, Prabhat, null, Adams, Ryan P.

arXiv.org Machine Learning

Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires many evaluations, and as such, massively parallelizing the optimization. In this work, we explore the use of neural networks as an alternative to GPs to model distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we apply to large scale hyperparameter optimization, rapidly finding competitive models on benchmark object recognition tasks using convolutional networks, and image caption generation using neural language models.